6.3 Signals of the Cardiovascular System
In many applications, meaningful features are extracted from the raw data to form
feature vectors. Mathematically, a feature is an individual measurable property or
characteristic of a phenomenon, and a feature vector is an n-dimensional vector of
numerical features that represents some object [3]. The goal of a classification
problem is to deduce the correct class of an object from its features using appropriate
classification algorithms, or classifiers for short. The result of a classification is
therefore the assignment of a class to each object. Since a classifier may assign the
wrong class to an object, there are several measures for assessing the quality of a
classifier. The most important measures are sensitivity, specificity and accuracy. For
simplicity, let us assume that there are two different classes, diseased (A) and
healthy (CG). Then the sensitivity is the percentage of actually diseased people in
whom the test detects the disease, whereas the specificity is the percentage of healthy
people whom the test correctly classifies as healthy. Finally, the accuracy is the
overall percentage of correctly assigned classes. The greater the sensitivity,
specificity and accuracy, the better the classifier.
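The three quality measures can be illustrated with a minimal Python sketch. The function name `classifier_quality` and the example labels are assumptions made for this illustration and are not taken from the clinical study; the sketch also assumes that both classes occur at least once in the true labels.

```python
def classifier_quality(true_labels, predicted_labels, positive="A"):
    """Return sensitivity, specificity and accuracy for a two-class problem.

    `positive` marks the diseased class (A); every other label counts as
    healthy (CG). Assumes both classes occur in `true_labels`.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1  # diseased, detected
            else:
                fn += 1  # diseased, missed
        else:
            if pred == positive:
                fp += 1  # healthy, wrongly flagged
            else:
                tn += 1  # healthy, correctly classified
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical example: 4 diseased (A) and 6 healthy (CG) subjects.
truth = ["A", "A", "A", "A", "CG", "CG", "CG", "CG", "CG", "CG"]
predicted = ["A", "A", "A", "CG", "CG", "CG", "CG", "CG", "A", "CG"]
quality = classifier_quality(truth, predicted)
```

In this made-up example three of the four diseased subjects are detected (sensitivity 3/4), five of the six healthy subjects are classified correctly (specificity 5/6), and eight of the ten assignments are correct (accuracy 8/10).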
In most cases, the data set of objects is divided into a training set and a test set
when a classification is performed. The training data serve to deduce rules for the
classification of the objects. These rules are then applied to the objects in the test
set. As the results depend on the choice of the split into training and test set, the
so-called k-fold cross-validation is applied: the data set is divided into k subsets of
equal size. In the first step, the first subset is taken as the test set and the
remaining subsets as the training set. In the second step, the second subset is taken
as the test set and the others as the training set, and so on. In each step, the
classifier is trained and then tested, and the overall quality measures are calculated
from those of the individual steps.
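The k-fold procedure described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the nearest-mean classifier, the scalar toy features, and all function names are hypothetical, and the overall accuracy is taken as the mean of the per-fold accuracies, which is one common way of combining the individual steps. In practice the objects are usually shuffled before splitting, which is omitted here.

```python
def k_fold_indices(n_objects, k):
    """Split the indices 0..n_objects-1 into k contiguous folds of nearly equal size."""
    base, extra = divmod(n_objects, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(features, labels, k, train, predict):
    """Mean accuracy over k folds; each fold serves as the test set exactly once."""
    accuracies = []
    for test_idx in k_fold_indices(len(features), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(features)) if i not in held_out]
        model = train([features[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        hits = sum(predict(model, features[i]) == labels[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / k


# Hypothetical toy classifier: assign the class whose training mean is closest.
def train_nearest_mean(xs, ys):
    means = {}
    for label in set(ys):
        values = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(values) / len(values)
    return means


def predict_nearest_mean(means, x):
    return min(means, key=lambda label: abs(x - means[label]))


# Made-up scalar features for ten objects, alternating healthy (CG) and diseased (A).
features = [0.1, 0.9, 0.2, 0.8, 0.15, 0.95, 0.3, 0.7, 0.25, 0.85]
labels = ["CG", "A"] * 5
mean_accuracy = cross_validate(features, labels, 5,
                               train_nearest_mean, predict_nearest_mean)
```

Because the two toy classes are well separated, every fold is classified correctly here; real feature vectors would of course overlap more.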
In the above clinical case of the given photoplethysmographic data, several kinds
of coefficients were evaluated as feature vectors; the best results, however, were
obtained with the coefficients from the frequency response approach:
Let $F(k_i)$ be the element of the fast Fourier transform corresponding to the
discrete frequency $k_i$, where complex features are computed only for the harmonic
frequencies. This is motivated by the observation that there is a clearly visible
periodicity in the spectrum of the PPG signals and by the hypothesis that the effect of
an aneurysm manifests itself in the periodic properties of the signal and not
necessarily in the aperiodic ones. Therefore, the first five harmonic frequencies are
extracted with MATLAB's findpeaks from the absolute values of the corresponding
spectrum. In almost all cases these frequencies align perfectly for the input and
output signals. If there is a slight deviation ($k_{i,\mathrm{in}} \neq k_{i,\mathrm{out}}$),
the features for the $i$-th peak are still divided, which gives the coefficients

$$H_i = \frac{F_{\mathrm{in}}(k_{i,\mathrm{in}})}{F_{\mathrm{out}}(k_{i,\mathrm{out}})}.$$
This is done for five intervals of ten seconds each, and the resulting values (real and
imaginary parts) are averaged. For the sake of simplicity, we will only consider the
case in which the input signal is the one recorded at the right thumb and the output
signal is the one recorded at the right toe.
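The computation of the coefficients H_i can be sketched in Python as follows. This is not the implementation used in the study: to stay within the standard library, a direct DFT replaces the FFT and a simple local-maximum search with a relative height threshold replaces MATLAB's findpeaks. The sampling rate, pulse rate, gain, delay, and all function names are assumptions, and the signals are synthetic rather than measured PPG data.

```python
import cmath
import math


def dft_half(x):
    """Direct DFT for bins 0..N/2 (O(N^2) stand-in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]


def first_peaks(mag, n_peaks=5, rel_height=0.05):
    """Indices of the first n_peaks local maxima above rel_height * max (DC excluded)."""
    floor = rel_height * max(mag[1:])
    peaks = [k for k in range(1, len(mag) - 1)
             if mag[k] > floor and mag[k] > mag[k - 1] and mag[k] >= mag[k + 1]]
    return peaks[:n_peaks]


def harmonic_ratios(x_in, x_out, n_peaks=5):
    """H_i = F_in(k_i_in) / F_out(k_i_out) for one interval."""
    f_in, f_out = dft_half(x_in), dft_half(x_out)
    k_in = first_peaks([abs(v) for v in f_in], n_peaks)
    k_out = first_peaks([abs(v) for v in f_out], n_peaks)
    # The i-th peaks are paired even if their bin indices differ slightly.
    return [f_in[a] / f_out[b] for a, b in zip(k_in, k_out)]


def averaged_features(intervals, n_peaks=5):
    """Average the complex H_i (real and imaginary parts) over all intervals."""
    all_h = [harmonic_ratios(a, b, n_peaks) for a, b in intervals]
    n = min(len(h) for h in all_h)
    return [sum(h[i] for h in all_h) / len(all_h) for i in range(n)]


# Synthetic example (all values assumed): five harmonics of a 1.2 Hz pulse.
FS = 25        # sampling rate in Hz
F0 = 1.2       # pulse rate in Hz
N = 10 * FS    # samples per ten-second interval


def pulse(t, gain=1.0, delay=0.0):
    return gain * sum(math.cos(2 * math.pi * h * F0 * (t - delay)) / h
                      for h in range(1, 6))


intervals = []
for j in range(5):  # five consecutive intervals of ten seconds
    ts = [(j * N + n) / FS for n in range(N)]
    x_in = [pulse(t) for t in ts]              # stands in for the thumb signal
    x_out = [pulse(t, 0.5, 0.05) for t in ts]  # attenuated, delayed toe signal
    intervals.append((x_in, x_out))

features = averaged_features(intervals)
```

Since the synthetic output signal is the input signal attenuated by a factor of 0.5 and delayed by 0.05 s, each coefficient has magnitude 2 and a phase that grows with the harmonic number. The complex entries of `features` can be split into real and imaginary parts to form the feature vector.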